Skip to content

Conversation

orlitzky
Copy link
Contributor

@orlitzky orlitzky commented Aug 26, 2025

Speed up a very long test by speeding up a very slow method. Uses a compressed/pickle list of (vertices, edges) that can be passed straight to the graph constructor rather than constructing those vertices and edges on-the-fly.

Dependencies

Copy link

github-actions bot commented Aug 26, 2025

Documentation preview for this PR (built with commit 48f90d7; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

new linear code and compute its coset graph.

This construction is tested in ``generators_test.py``, where the
result is compared (up to isomorphism) with the result from this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't graph isomorphism something not known to be in polynomial time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general maybe, but with these two graphs it's relatively fast.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is only fast because they're equal, no? Or is the generation actually random?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vertices/edges were matched up exactly but the types were wrong. I added a loop to go through and fix them. We get equality now.

The loop slows it down a bit but it's still much better than the original.

@user202729
Copy link
Contributor

On one hand, this does improve the construction time for the user too, which is good. On the other hand, this is literally sweeping the problem under the pytest rug. (we discussed this in #40443 (comment) . The conclusion is, while warn on long pytest is technically implementable, nobody have gotten around to implement it. The advantage is that implementing it in pytest is supposedly shorter than in Sage doctest framework (although #39746 is 20 lines of difference excluding doctest, so it's not too clear whether this is the case).

The "move to database" is probably not too sustainable. At least 40KB is not too large, compared to e.g. expression.pyx being 400KB.

Also looks like test-long is failing, so this doesn't look particularly mergable any time soon. We already have some non-Python files such as minorfix.h, so I don't think it's an issue with the old build system.

@tscrim any other comment?

@orlitzky
Copy link
Contributor Author

On one hand, this does improve the construction time for the user too, which is good. On the other hand, this is literally sweeping the problem under the pytest rug. (we discussed this in #40443 (comment) . The conclusion is, while warn on long pytest is technically implementable, nobody have gotten around to implement it. The advantage is that implementing it in pytest is supposedly shorter than in Sage doctest framework (although #39746 is 20 lines of difference excluding doctest, so it's not too clear whether this is the case).

I don't disagree with any of that, and I'm not married to the test. We could just say "the original implementation was deleted, check the git log" and be done with it. There have got to be some other properties of this graph that we can verify quickly?

The "move to database" is probably not too sustainable. At least 40KB is not too large, compared to e.g. expression.pyx being 400KB.

We could leave the algorithm as a fallback in case the pickle data is not there. That would give packagers a choice between 40KB of space or 60s of time. But if we're going to solve the "slow doctest" problem, we wouldn't be able to test the fallback.

Really though I think we should wait to get to n=2 at least before making slippery slope arguments.

@orlitzky
Copy link
Contributor Author

Also looks like test-long is failing, so this doesn't look particularly mergable any time soon. We already have some non-Python files such as minorfix.h, so I don't think it's an issue with the old build system.

It's due to the missing __init__.py that is generated by meson. There may be some other way to fix it, but I'd rather just wait for it to start working in the beta1.

@user202729
Copy link
Contributor

It's due to the missing __init__.py that is generated by meson.

Hm, I think sageinspect implements some workarounds for getting source file paths. May worth taking a look.

The algorithm to construct the Shortened 000 111 extended binary Golay
code graph takes a long time, and we plan to cache its vertices and
edges. To ensure that the cached data continue to agree with the
construction, we add a new pytest file that checks the two for
isomorphism.
Add a pickled/xz'd list of vertices and edges for the Shortened 000
111 extended binary Golay code graph. It's much faster to construct
this graph from cached data than it is to do it on-the-fly.
Reimplement shortened_000_111_extended_binary_Golay_code_graph() to
load its vertices and edges from compressed, pickled data. This speeds
up the construction significantly, and eliminates a persistent CI
warning about the corresponding test being slow.
@orlitzky orlitzky force-pushed the fast-golay-construction branch from ee36ee5 to 48f90d7 Compare August 27, 2025 15:39
@orlitzky
Copy link
Contributor Author

It's due to the missing __init__.py that is generated by meson.

Hm, I think sageinspect implements some workarounds for getting source file paths. May worth taking a look.

The problem is that, if you install sage with meson, there are no "namespace packages" -- every directory has an __init__.py. With setuptools, the opposite is true; everything is a namespace package. This is not supposed to happen: you should know if your packages are namespace packages or not. (Upstream cython does not work with namespace packages yet, so the __init__.py belong there, but everyone who cares is working on meson now.)

The importlib stuff is the correct way to find data files, but it behaves differently for namespace packages. If half of sage.graphs.generators can come from one python package and the other half from another, then "the data files for sage.graphs.generators" has to reference both places. Ergo, files() is returning a MultiplexedPath where we want a single PosixPath.

I could make it handle both, but (a) by the time I'm done, the switch to meson should be finished, and (b) missing __init__.py is simply wrong for the foreseeable future, so I don't want to add workarounds for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants